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Abstract 

The Riemann zeta function on the critical line can be computed using 
a straightforward application of the Riemann-Siegel formula, Schonhage's 
method, or Heath-Brown's method. The complexities of these methods 
have exponents 1/2, 3/8 (=0.375), and 1/3 respectively. In this paper, 
three new fast and potentially practical methods to compute zeta are pre- 
sented. One method is very simple. Its complexity has exponent 2/5. 
A second method relies on this author's algorithm to compute quadratic 
exponential sums. Its complexity has exponent 1/3. The third method 
employs an algorithm, developed in this paper, to compute cubic expo- 
nential sums. Its complexity has exponent 4/13 (approximately, 0.307). 

1 Introduction 

The Riemann zeta function is defined by 

n=l 

It can be continued analytically to the entire complex plane except for a simple 
pole at s = 1. The values of C(l/2 + it) on finite intervals are of interest to 
number-theorists. For example, researchers investigating moment questions or 
more recently links to Random Matrix Theory models often utilize numerical 
evidence about the zeta function on the critical line. Also, the numerical verifi- 
cation of the Riemann hypothesis is clearly dependent on this kind of data. Such 
considerations led to pursuits of efficient methods to compute C(l/2 + it) with 
polynomial accuracy; that is, methods to numerically evaluate C(l/2 + it) for 
t > to, to a fixed positive number, with absolute error bounded by t^^ for fixed 
positive A. Searches for such methods were also motivated from a computational 
complexity perspective; namely, the zeta function is of fundamental importance 
in number theory, so, one may simply ask, how fast can it be computed? 

The aim of this paper is to present new efficient methods to compute zeta. In 
particular, our fastest method, which has complexity f4/i3-Ho(i) ^^^^^ ^ 0.307), 
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improves on the best previously known "complexity bound" to compute 

zeta by a noticeable margin. 

Early examples of methods to compute zeta include the Euler-Maclaurin 
formula (see [O]) and the Riemann-Siegel formula (see |T]). The problem was 
also tackled by Odlyzko [0S| [0|, Schonhage [OS] [5], Heath-Brown [HB], Tur- 
ing |Tu| . Berry and Keating [BK| . and Rubinstein [Rj, among others. 

Most methods to compute the zeta function rely on the Riemann-Siegel 
formula as starting point. One frequently used version of the formula is 

C(l/2 + tt) = 5R ^2e'^(*'^^n-i/2exp(iilogn)j +<S^{t)+R{t) (1) 

where ni := [^/t/{2TT)\ , 6{t) and $(i) are certain real functions that can be 
evaluated in O(logt) time, and the error function R{t) satisfies 

|i?(t)| < OmSt'^^^ for t > 200 (2) 

See [E]. The bound ([2|) suffices for most applications in the literature. There 
exist other more accurate versions of the Riemann-Siegel formula; that is, ver- 
sions with a different <i>(t) and an error function R{t) that declines faster than 
t^^/^. In this paper, we only attack the problem of computing the main sum 
in ^ with polynomial accuracy. In this sense, our zeta methods are essentially 
independent of the precise choice of the functions <i>(t) and R{t). 

Given the computational resources of today, one can easily use the Riemann- 
Siegel formula to compute C{l/2 + it) at a single point for t around 10^^. How- 
ever, due to probabilistic or asymptotic considerations, one is often interested in 
the values of zeta at a relatively large number of points; consider the computa- 
tions of Odlyzko [O] and Odlyzko and Hiary [H0| for example. The amortized 
complexity methods of Odlyzko and Schonhage [OSj are designed to satisfy 
exactly this requirement. Specifically, their methods permit the evaluation of 
C(l/2 + iT + it) at 0(Ti/2) points in the interval t € [0, T^/^] with complexity 

Tl/2+o(l)^ 

Despite the distinct advantages offered by the Odlyzko-Schonhage algo- 
rithms, their methods did not improve the complexity of computing the zeta 
function at a single point. A single evaluation of (^{1/2 + it) still consumed 
0{t^^'^). Schonhage [S] used the Fast Fourier Transform (FFT) and subdivisions 
of the main sum in the Riemann-Siegel formula (see Section 2) to reduce the 
cost of a single evaluation of C(l/2 + it) to t3/8+o(i)^ Later, Heath-Brown [HB| 
presented ideas that could improve the complexity of a single evaluation to 
^1/3-1-0(1) jjg described his approach in the following way: 

The underlying idea was to break the zeta-sum into t^^^ subsums 
of length i^/^, on each of which exp(zt log(7i + /i)) could be ap- 
proximated by a quadratic exponential e{Ah + Bh^ + f{h)) with 
f{h) = 0(1). One would then pick a rational approximation a/q to 
B and write the sum in terms of complete Gauss sums to modulus q. 
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This is motivated by section 5.2 in Titchmarsh, but using explicit 
formulae with Gauss sums in place of Weyl squaring. 

The work of [S], [Hlj . and [HBj (see also [TJ, page 99) makes it apparent 
that, a possible approach to improving the complexity of computing zeta is to 
derive efficient methods to numerically evaluate sums of the form 



This is our approach to improving the complexity of computing zeta. 

The paper is organized as follows. In Section 2, we describe the relation 
between sums of the form and the zeta function. In particular, we de- 
rive methods to compute C(l/2 + it) with complexities ^2/5+0(1)^ ^1/3+0(1)^ ^^^^ 
^4/13+0(1)^ The first method is very simple. The second and third methods rely 
on the existence of "efficient" algorithms to compute "quadratic" exponential 
sums and "cubic" exponential sums with a relatively small cubic coefficient. By 
"quadratic" and "cubic" we mean sums of the form ([3]) with f{x) a polyno- 
mial of degree 2 and 3 respectively. A nearly-optimal algorithm to compute 
quadratic sums has already been derived in [Hlj . As for cubic sums, they can 
be evaluated using the algorithm presented in Sections 3, 4, and 5 of this pa- 
per. Specifically, Section 3 contains an informal description of our algorithm to 
compute cubic sums, Section 4 details the steps involved in the algorithm, and 
Section 5 contains proofs of various lemmas employed in Section 4. 

We wish to make a few comments about the structure and presentation of 
our algorithm to compute cubic sums. This algorithm is the essential component 
of our ^4/13+0(1) method to compute zeta. 

Apart from the use of FFT and a few other ideas, the cubic sums algorithm 
is essentially a generalization of the quadratic sums method of [HI] . Therefore, 
it is helpful to familiarize one's self with the ideas of [HI] first. Also, the goal 
of this paper is to compute cubic sums rather than derive elegant asymptotic 
expressions for them. Thus, for example, in Section 4 we explain that the 
cubic algorithm involves evaluating certain exponential integrals. In Section 5, 
we show how to compute each of these integrals separately. By doing so, we 
obscure the fact it is possible to combine some of these integrals in a manner 
that might collapse them. We do not concern ourselves with simplifying such 
integrals as this does not improve the exponent of the complexity. 

To summarize. Sections 4 and 5 communicate the rigorous details of our 
cubic sums algorithm. If one merely wishes to acquire a taste for its general 
structure without getting too involved in the details, then reading Section 3 will 
likely suffice. 

2 Outline of fast methods to compute ({1/2 + it) 

We show how to compute C(l/2 + with polynomial accuracy. We borrow 
some of Schonhage's [Sj notation. Assume t > 10^ and let /3 e [0,1/4]. We 





(3) 
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mention from the outset that to obtain our ^2/5+0(1) ^ ^1/3+0(1) ^ ^^^^ ^4/13+0(1) 
complexities, the parameter (3 has to be speciahzed to 1/10, 1/6, and 5/26 
respectively. Also, the complexities obtained in this section (i.e. 2/5, 1/3, and 
4/13) are the outcomes of simple optimization procedures that we do not make 
explicit. Let us specify our computational model and some notational issues 
first. 

An arithmetic operation means an addition, a multiplication, an evaluation 
of the logarithm of a positive number, or an evaluation of complex exponential. 
We perform arithmetic using 0((logt)^) bits. We remark that in practical 
versions of our algorithms, many fewer bits will suffice to perform arithmetic. 
We measure computational complexity (or time) by the number of arithmetic 
operations required. This in turn can be routinely bounded in terms of bit 
operations because all the numbers that occur in our algorithm have 0((logt)^) 
bits. Frequently, we abbreviate "arithmetic operations" to simply "operations." 

Let [a:J denote the largest integer less than or equal to x , [x] denote smallest 
integer greater than or equal to x, {x} denote x— [ccj , and log a; denotes log^ x. 
Let exp(a;) and both stand for the usual exponential function (they are used 
interchangeably). We define :— 1 whenever it occurs. Finally, all big-O 
notation constants are absolute. 

We now describe our methods to compute zeta with error bounded by 
^-A+o(i)^ A fixed positive constant. Subdivide the Riemann-Siegel sum ^ into 
0{logt) subsums of the form 



2N 

E 



n 



-1/2 



exp(iilogri), (4) 



where t^^"^ ^ < N < y/t/{2'K), plus a remainder Dirichlet series of length 
0{t^^^^^). We evaluate the remainder series directly. Choose K ^ Nt^^^^"^^ 
say K satisfies Ntf~^^/'^ <K < Ntl^-^/"^ + 1. Rewrite g]) as 



EE 



— g( — — ))_ _^ Jlemainder (5) 

vGV k=0 

where F is a set of [N/ K\ equally spaced points in the interval [N, 2N] and 
Remainder is a Dirichlet series of length 0{K) = 0{t^), which we evaluate 
directly. The main sum in ^ can be rewritten as 

exp(itlogu) exp(iilog(l + k/v)) , , 

Using Taylor expansions, the inner sums in ([6]) can be expressed in the form 

/=0 ■ k=0 \ m=l / 
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Since k/v < K/N < we can truncate both Taylor series in ([7]) to obtain 

(-i)"+ifc'" ^ 



/=0 fc=0 \ "1=1 



+ tkM + 5l (8) 



where 

Choose L ^ M \{\ + 2)/(l/2 - /?)]. Note that M = 0(1). By simple 
calculations, the sum ((8]) is equal to 



i=0 fc=0 \ m=l 

The inner sums in ([9]) can be reduced to the form 

1 / (-l)'"+ifc'" (-1)™+1A;™\ 
—V V A:' exp It V ^ +ity ^ '- 10 

/e=0 \ 771=0 m—ni' I 

where < ^ < Af and 1 < to' < M are integers. Let m' = [1/(1/2 - /?)] . Then 
the sum (1101) can be reformulated as 



1 ^ I ^( -|V77+liL7n\ / °° s( '\\s(m' + 1) l-fYusm' \ 

iyVfe'exp zt y i^i^ — V ^ , ^ (11) 

k=0 \ m=0 I \s=0 ^ ' / 



where 

Af-77l' 



Cfc 



E 



(-l)"TO'fc" 
('^ ^" ™') '^'^ 



Note that + where < 2k/v < 2t'^^^/'^. We can truncate the Taylor 

series with respect to s in (fTTjl after i? = [(A + 2) logt] terms. We obtain 

fc = \ 771=0 J \S = ^ ^ / 

The sums ([T^ can be easily reduced to the form 

K /m'-l 



K Im'-l \ 

— J2 exp ' ^ [-^/^"' V^^"], (13) 



fc = \ 771 = 



where < j = O(logt) is an integer. Finally, suppose any sum of the form (|13p 
can be evaluated with error 0{t^^^^) = 0{K^'^^'^^^^^) in t"*^^' time, where we 
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allow for a precomputation costing ii/2-/5+o(i) operations. Then (^{1/2 + it) can 
be computed in ii/2~/3+o(i) time with error bounded by ^--''+0(1). 

We specialize /3 to obtain the complexities stipulated at the beginning of the 
section. First, to compute zeta in ^2/5+0(1) operations, choose /? = 1/10. Then 
by the previous argument, we only need to be able to evaluate sums of the form 

1 ^ 

F{K, j; a,b) = — ^^k^ exp{2Triak + 27ri6A;^), 

fc=0 

a,6eM/Z, 0<j==O(logi) (14) 

in t°''^^ time, where we allow for a precomputation costing t2/5+o(i) operations. 
To accomplish this, we can directly precompute the functions 

F{K,l]h~b), 0<l^O{logt) (15) 

at all points (a, b) in the set 

L {{d/K, t/K^)\0 <6<K,0<t<K^} 

The cost of the precomputation is — ^2/5+0(1)^ Once the precomputa- 

tion is carried out, Taylor expansions can be used to evaluate F{K, j; a.b) for 
any (a, 6) € [0,1) x [0,1) with error bounded by t-^+°W^ Explicitly, given a 
pair (a, b) which is not necessarily a member of the lattice L, let (a, b) a member 
of L closest to (a, b). Also, let Aa := K{a — a) and A6 := K'^{b — b). Then, 

F{K,r,a,b) = ^-^('^)F(i^,j + 2n-/;a,6)(Aa)'(A6)" (16) 

n=0 ' 1=0 ^ ^ 

It is easily seen that the series indexed by n in can be truncated after 
0{logt) terms. 

Our i2/5+o(i) uiethod is simple to program. The only drawback is the data 
resulting from the precomputation has to be temporarily stored. Considering 
the range of t where the method can be realistically applied, and given today's 
computational resources, the amount of data to be stored is certainly manage- 
able. Let us give an example. 

Suppose we want to compute zeta when t « 10^" with absolute error less 
than 10~^. This is a medium level of accuracy, but it is generally sufficient 
to perform moment calculations and separate zeroes for instance. Since we 
are discussing matters of practicality, let us assume errors in our numerical 
evaluations of the inner sums in ([6]) behave as independent random variables 
of mean zero. This assumption is not needed for the theoretical complexity of 
our method, but it a reasonable and generally reliable one to make, see jO] for 
example. With this setup, one finds via quick accounting that it is enough to 
take < Z < 11 in (jlSp and store the precomputed values of F{K, I; a, b) using 
32-bit numbers. So, the amount of data to be stored is w (8/7) x 10^ x 12 x 4 
bytes, which is about 0.05 Gigabytes. Note however, in order to ensure there is 
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a significant speed-up resulting from the use of our method, one would need to 
precompute F{K, I; d, b) on a denser lattice than L. Since a denser grid implies 
fewer derivatives of F{K, i,d,b) would be needed, the functions F{K, j; a, h) can 
then be evaluated more quickly. For example, if we use a lattice that is 17 
times denser, we would need to store about 9 Gigabytes of data instead. The 
larger 9 Gigabyte dataset is still of moderate size. Moreover, it would allow for 
a speed-up by a factor of 5 to 10 compared to the Riemann-Siegel formula. This 
is a respectable improvement considering the elementary nature of the method. 

To summarize, although our i;2/5+o(i) jng|;];iod is not as fast as those of 
Schonhage and Heath-Brown, it is far easier to implement and its error terms 
are generally easier to control. It is a potential replacement for the straight- 
forward application of the Riemann-Siegel formula for t greater than 10^°, and 
also for lower ranges of t provided the implementation is reasonably optimized. 

To derive our zeta method, choose (3 = 1/6. We need to be able 

to compute sums of the form F{K,j; a, b) in t°^^'' operations but this time the 
upper bound for the cost of our precomputation is only t^/^+°W operations. 
Recently, this author [HI] has derived a polynomial-time algorithm to compute 
F{K, j;a,b) as well as sums of the form 

^ 1 

G(ii:,j;a,6) := V7--exp(27rmfc-|-27ri6A:2), a,beR/1, < j = 0{logt) 
fc=i 

Specifically, it was proved 

Theorem 2.1. There exist absolute constants Ki and K2 such that for any 
e G (0,6"^), any positive integer K , any non-negative integer j, any a,b G R/Z, 
and with an underlying computational model that performs arithmetic opera- 
tions using 0{v'^)-bit arithmetic, where v = u(K,j,e) := (j -|- l)\og{K/e), each 
of the functions F{K, j; a,b) and G{K, j; a,b) can be computed with absolute 
error bounded by 0{h''^^e) using O {ly'^^) arithmetic operations. The big-0 con- 
stants are absolute. 

By choosing e to be some fixed negative power K , we obtain a method to 
compute zeta in operations. There are two important practical advan- 

tages to this method. First, it does not require performing any precomputations 
or storing any data. Second, it is not too hard to implement. 

In Section 4, we take Theorem 12. II a step further. We tackle the sums 

1 ^ 

H{K, j; a, 6, c) := — P exp(27rmA: + 2Tribk'^ + 27rick^) 

fe=0 

when c is relatively small. Our goal is to derive an efhcicnt algorithm to numer- 
ically evaluate H{K,j; a,b,c). We will use this algorithm as the building block 
for a method to compute zeta. Given this roadmap, we prove 

Theorem 2.2. There exist absolute constants K3, K4, and K5 such that for any 
/i G [0,1], any e G (0,6^"'^), any positive integer K, any non-negative integer 
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j, any a,b & K/Z, any c G [0, fC"*^^"^^], and with an underlying computational 
model that performs arithmetic operations using 0{v'^)-hit arithmetic, where 
V = v{K,j, e) :— {j + 1) log(iir/e), the function H{K, j; a, 6, c) can be computed 
with absolute error bounded by 0{h''^^e) using 0(v'^'^) arithmetic operations pro- 
vided a precomputation costing 0{v'^'^ K^^^) operations is performed. The big-O 
constants are absolute. 

In Theorenis l2.ll and l2.2l a bit complexity bound follows routinely from the arith- 
metic operations bound because all the numbers that occur in the algorithms 
have 0{v^) bits. We did not try to obtain numerical values for the constants K3, 
K4, and K5 in Theorem 12. 21 With some optimization, they probably could taken 
to be around 4. Also, in a practical version of the algorithm the arithmetic can 
be performed using substantially fewer than Olv"^) bits and we would likely be 
able to replace v{K,j, e) by j + \og{K/e). 

Theorem 12.21 yields our ^■*/i3+o(i) ^eta method as follows. First, choose 
P = 5/26. Then by the argument at the beginning of this section, we only need 
to deal with sums of the form 

H{K,j;a,b,c), a,6eR/Z, 
0<c<t/N^, 0<j = O{logt) 

Assume K > I. Since K < iVt5/26-i/2^ then c < t^/'^^/K^. Let fi be the solution 
of the equation t^/^^ = isT^. If > 1, then K < t^/'^^. So, we can use FFT to 
precompute the functions 

H{K,j;&Ad), 0<j^O{\ogt) 

at all points (a, b, c) of the lattice 

{{S/K, t/K^, -i/K^) \Q<5 <K,Q<t <K^,Q<-i< t^^^^} 

The cost of the FFT precomputation is K'^t^/^'i+oW < ^4/i3+o(i)_ Qnce the 
precomputation is carried out, Taylor expansions can be used to evaluate the 
function H{K,j;a,b,c) in t°^^^ operations. If ^ < 1, then apply the algorithm 
for cubic sums. According to this algorithm, we merely need to perform an FFT 
precomputation costing 0{K'^'^^°^^^) = 0(<:'*/^'^+°(^^) operations. Lastly, recall 
the error function R{t) in the Riemann-Siegel formula ([T]). Then put together, 
we have obtained 

Theorem 2.3. There exists an absolute constant Kg such that for any t > 10^ 

and luith an underlying computational model that performs arithmetic operations 
using 0{{\ogt)^)-bit arithmetic, the function C(l/2 + can be computed with 
error R{t) using 0((logt)'*'''t'*/^^) arithmetic operations. The big-O constants 
are absolute. 

appears to be compatible with the amortized com- 
plexity techniques of Odlyzko-Schonhage |0S| . This means it is possible to 
modify it so as to permit the computation of OiT'^^'^^) values of Q{\/2 + iT + t) 



8 



for t e [0, r4/i3] in t4/i3+°(i) time. It also appears if certain quite substantial 
modifications to our cubic sums algorithm are carried out, then the i'*/i3+o(i) 
method is capable of improvement to ^3/10+0(1)^ These issues will be explored 
in a separate paper [H2] . 

Finally, we remark that the essential ingredients in the aforementioned meth- 
ods are quite independent from the zeta function itself. They employ specific 
properties of C(l/2 + it) only in so far as they rely on the Riemann-Siegel for- 
mula as a starting point. In the case of our method, the use of the 
Riemann-Siegel formula is not critical. In fact, one can still achieve ti/3+°(i) 
complexity starting with the Eulcr-Maclaurin formula. 

3 An informal description of the algorithm to 
compute cubic exponential sums 

The general idea behind Theorem 12. 21 is to extend the basic approach developed 
in |H1| to compute quadratic (theta) sums. That approach depends on two main 
ingredients; the periodicity of the complex exponential and the self-similarity of 
the Gaussian. By the self similarity of the Gaussian we mean the property 

/oo 
e"*-*' dt = V^e"'/^ aeC 
-00 

In the case of cubic sums, we still have the periodicity but not the self-similarity. 
However, if the cubic coefficient c is relatively small, then there is approximate 
self-similarity. This permits repeated applications of Poisson summation, much 
like what occurs in the quadratic case. The length of the cubic sum is cut 
by at least one half after each application of Poisson summation, but also the 
magnitude of the cubic coefficient grows. So, after a certain number of iterations, 
the approximate self-similarity weakens and it is no longer possible to simulate 
the algorithm for quadratic sums. At this point, we stop iterating and we use 
saddle-point methods to reduce the problem to evaluating sums of the form 

1 *^ 

^ /c' exp{2TTif3ik + 27ri(32k^ + ... + 2TTi(3sk'^) (17) 

fc=0 

where 3 < S — 0{logK). Finally, the Fast Fourier transform is employed to 
precompute a large number of sums of the form (|17[) . 

We elaborate. The initial step in our algorithm is to convert the cubic 
sum to an integral. This is followed by two "while loops," a transformation, 
another "while loop," and finally an FFT precomputation. We overview this 
chain of steps in more detail in the next few paragraphs. Before doing so 
however, we digress to comment that this chain of steps is by no means the 
only path to implementing the essential features of the algorithm for cubic 
sums. In particular, although our initial reformulation of the cubic sums as an 
integral dictates many technical structures in the remainder of the paper, it is 
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completely avoidable. The reason we choose this implementation is because it 
allows us to avoid reconstructing the algorithm for quadratic sums from scratch. 

On to a description of the algorithm for cubic sums. Let e G (0, e~^). Recall 
our assumption in Theorem 12.21 that c e [0, K^^'^~^^'>]. Rewrite H{K,j;a,b,c) 
as the integral 

1 fi/^ 

— / F{K;a- ax,b)Q{K,j;x,c)dx 

where 

1 ^ 

F{K; a, b) := F{K, 0; a, 5), Q{K,j; x, c) — ^ exp(27rmA: + 2nick^) 

and initially a = I. In the first loop, the quadratic and cubic sums F and Q are 
partially "decoupled" . The extent of decoupling is measured by the parameter 
a, which remains positive throughout. The smaller the value of a at the end of 
first loop, the more independent is the quadratic sum F{a—ax, b) from the cubic 
sum Q{K,j; x, c). With each iteration in the first loop the size of c grows while 
the length K of the cubic sum decreases. The loop is terminated when we near 
the region where c is too large to efficiently apply Euler-Maclaurin summation 
to Q{K,j] X, c). Once the first loop is terminated, Q is converted to an integral. 
This yields the expression 

— - / exp{2TTicy^) / exp{~2TTixy)F{K; a — ax, b) dx 
Jo J-l/i 

where a = 0{K-^^-'''>), N ^ 0{K''), and c = 0{1/N'^). In the second loop, 
the algorithm for quadratic sums is applied to F{a — ax, b). It is possible to do 
so despite the presence of an integral sign with respect to a: precisely because 
a is relatively small. With each iteration of the algorithm for quadratic sums, 
the value of a increases, while the length K of the quadratic sum decreases. 
The loop is terminated when a exceeds v{K,j, , at which point the cost of 
further applications of the quadratic algorithm starts becoming significant. At 
the end of the second loop, we are left with an expression of the form 

— — / exp(27rjc?/'^) / exp(— 27rj?/a; — iiriaix — 2ma2x'^)F{Ad; a — ax, b) dx 

Jo J~l/4 

where a > v{K,j,e)^^, M < N/a, and ai, a2 are certain parameters that are 
related to a. Some elementary saddle-point calculations are used to reduce the 
last expression to sums of the form 

1 " 

^ fc' cxp(27ri^ifc + 2TTil32k^ + ... + 27ni/3sfc^) (18) 

fc=0 

where 3 < 5 < 3 + log A^/ logM and < / = 0{v{K,j, e)) are integers. In the 
third loop, we perform a dyadic approximation of M . This reduces the problem 



10 



to computing sums of the form 



^ 2" 

^ ^ fc' exp{27Tif3ik + 2TriP2k^ + ■■■ + 2ml3sk^) (19) 

fe=0 

where 2" = 0{M) and < / = 0{i'{K,j,e)) is an integer. There are some 
relations among the coefficients Ps+3, s > 1 which we exploit in the course of 
performing an FFT precomputation of the sums p9)) . The overall cost of FFT is 
^4+o(i) ^ j^4t,+o(i)_ ^fi-gj. precomputation, the sums (fT9|) can be evaluated 
quickly via Taylor expansions. This concludes our sketch of the algorithm for 
cubic sums. 

From our discussion in Section 2, it is not hard to deduce that the value of 
fj, relevant to computing a sum of the form 

2N , 

exp(logrt) 

is given by 

(1/13) log i 



logA^- (4/13) logt 

In particular, if = then /i = 2/5. For this reason, as well as for the 
sake of definiteness, we prove Theorem 12.21 onlv in the case fi = 2/5. It will be 
clear however that the proof holds virtually unmodified for any < /i < 1 (the 
absolute constants implied by our proof are independent of /i). 



4 The algorithm for H{K, j; a,b, c) 

Let e e (0,e-i), j > 0, and K > 0. Define v{K,j,e) := (j + l)log(if/e). We 
say K is large enough \i K > A{K,j,e) where A{K,j,e) := 50'^i'{K,j,e)^. We 
also define F{N; a, h) := F{N, 0; a, b) and 

1 ^ 

Q{N, j; a, c) := — rS^ F exp(27rwA: + 2Trick^) 

fe=0 

In this section, the quantities j, and e are treated as global constants, by 
which we mean different occurrences of them will denote the same values. For 
this reason, we drop the dependence of A and v on K, j, and e throughout 
Section 4. We use the same computational model as the one described at the 
beginning of Section 2 except now arithmetic is performed using 0(v[K,j, e)^)- 
bits 

Finally, in order to spare the reader some mundane and other straightfor- 
ward details, we sometimes use informal phrases such as "It is possible to re- 
duce/simplify the problem of computing the function XK,j{-) to that of com- 
puting the function Yft:j (-)j " '^i' "Ii^ order to compute the function Xxj(.), it 
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is enough/it suffices/we just need to compute the function Ykj{-)" This means 
there exist absolute constants Ri and R2 such that, for any e e (0,e~-^), once 
the function YK.j{-) is computed for the relevant values of its arguments with 
error bounded by 0(e), then the function Xxji-) can be computed for the 
relevant values of its arguments with error bounded by 0{i'{K,j,e)'^^e) using 
0{i'{K,j,e)'^^) arithmetic operations. The meaning of the phrase "relevant val- 
ues of the arguments" will be clear from the context. We frequently say "the 
function Xxji-) can be computed/evaluated/handled efficiently /quickly /easily." 
As one might expect by now, this means there exist absolute constants K3 and 
K4 such that, for any e G (0, e~^), the function Xkj{-) can be computed for the 
relevant values of its arguments with error bounded by 0{v{K,j,e)'^^e) using 
0{v{K,j,eY^) arithmetic operations. 

4.1 The first loop: decouple the cubic and quadratic sums 

Assume that K is large enough, otherwise the problem trivializes. Let c G [0, K^^^^^] . 
Define 




where 



(-1/2,-1/4), 
(x3,2/3) (1/4,1/2), 



{X2,y2).= (-1/4,1/4) 
(0:4,2/4) (1/4,3/4) 



(20) 



Then 



H{K,j;a,b,c) 



SaU^, c, 1) + S^i'^iK, c, 1) + S^f/K, c, 1) 



By lemmas [01 through l5.4l in Section 5, the functions Sj^j^^j{K, c, 1) and S^f^{K, c, 1) 
can be computed quickly. So, it is enough to show how to evaluate S^j^j{K, c, 1). 
We may assume that if is a power of 2. Define Nm ■= Cm '■= 2^'"c, and 

am := 2-™. Since 



) - Q{Nm,j;x + 1/2, 



then using the change of variable x 2x, we obtain 

^af,ji^->n, C,n, Om) = S^fj{N,n+l, C^+l, ajn+l) + Ra,b,]{^rn, C,n, a^) (21) 



where 




By lemmas [5 . 1 1 through 1 5 . 4i the function R^f j{Nm,Cm,otm) can be computed 
efficiently. Thus, we may iterate the relation (PTjl . After at most OiXogK) 
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iterations, either CmN"^ > 1/A or is not large enough, at which points we 
stop iterating. The latter case is a boundary point of the algorithm; the problem 
simplifies to computing the sums 

K 1/2 

^ exp(27rjafc + 2'iTihk^) I exp(27ria:n — 2'iTiamkx) dx (22) 

fc=l -^-1/2 

where < n < N„i = 0(A) is an integer. By the change of variable x x+l/2, 
the sums ([22]) can be reduced to the form 

f e^^"^^F{K;d + f3x,b)dx (23) 
Jo 

where /3 S [—1, 1]. By Lemma [5.101 functions of the form ((23|) can be evaluated 
efficiently. So put together, we may assume Nm is large enough and that the 
first loop is terminated due to a violation of the condition < CmN^ < 1/A. 

If Cm no longer satisfies the condition < CmNf^-^ < 1/A, then CmNm > 1/A. 
Observe 

1/A < c„^Nl 1/A < (23"c)(i^/2™)2 ^ a„, < AcK^ 

So recalling < c < K'^^^^, it follows < Ai^-3/5 ^his in turn implies that 
Nm = amK < Aif2/5. 

To summarize, let N :— Nm, ct :— a„j, and re-define c :— Cm- Assume 
^3/5 > otherwise, the sum H{K,j;a,b,c) has length 0{A^°/^). So, it can 
be computed directly. Our problem has then been reduced to evaluating 

S^i'^iN, c,a) = ^ [ ' F{K- a - ax, b)Q{N,j- x, c) dx 
iVJ 7-1/4 

with conditions 

1/A < ciV^ < 2/A, A<N<AK^/^ 

A/K <a<AK-^'^ <l/A (24) 

4.2 The second loop: apply the algorithm for quadratic 
sums to F{K; a + ax, b) 



By lemmas 15.11 through 15.31 in Section 5 we have 

"1/4 



^abi a) = — / / exp(27ricy^ - 2Trixy)F{K; a + ax, b) dy dx 

NJ J-i/iJo 

+El,^^{N,c,a) 



and Ej^^ j{N,c,a) can be computed efficiently. So, we only need to show how 
to evaluate the function 

— / / y-' exp{2Tricy^ — 2'Kixy)F{K;a + ax,b) dx dy (25) 

Jo J -1/4 
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where iV, c, and a satisfy assumptions and a,6 e R/Z. 

The purpose of this section is to show how to appfy the van der Corput 
iteration to the function F{K] a + ax, h) in (^5)) . In order to do so, we need to 
introduce some notation. To better understand the motivation for the notation, 
one wiU find it helpful to review lemmas 6.6 and 6.7 in pHlj first. In particular, 
recall the relation 

V26 V 26 26 46 

+R{K,d. + &x,b) + Oie-^), (26) 

which is valid for any integer K > 1000, any x E [—1/4,1/4], and any tuple 
(5,5,6) e [-1,1] X [0,2] X [0,1/4] satisfying 

\d + ax] < \ 2bk + h + ax\ {h + fix, 6) e (0, 2) x [0, 1 /4] 

for all X e [—1/4, 1/4]. The structure of the function R{K,a + ax,b) is fully 
described by lemmas 6.6 and 6.7 in [HIJ; see (|32p. We wish to learn the outcome 
of repeated applications of relation ((26|) . With this in mind, we proceed to define 
our notation. 

For z eC and 9 e {-1, 1}, define 



z if6' = -l 
z if 6* = 1 



By the periodicity of the complex exponential, for reals a, 6, a, and integers 
we have 

F{k;~a^ 5ix±\l2~b± 1/2) = F(i^; 5 + ax, 6) 

Thus, for _fir > 0, there exists exactly one or two tuples (oq, 6o, 6*) e x {—1, 1}, 
which depend only on (a, 6), such that 

F{k; d + ax, 6) = ^e{F{k- do + Oax, bo)), (27) 
(So + eax,bo) e (0,2) X [0, 1/4], for all x e [-1/4, 1/4] 

Given a real pair (a, 6), define {do, bo, 9) be the tuple satisfying P7)) with the 
least possible oq. Since {do, bo, 9) depends only on (a, 6), we may write it in the 
form 

^-(5,6) (ao,6o,6i) 

Also, define the function 

T(..M,.):^g,^,|), MO 

Let {do, bo) '■= {a, 6), a_i :— a, and for integers / > sequentially define 
{ai,bi,9i) *(a/,6/), (a/+i, 6;+i, a;) := T{ai,bi,9i,ai^i) 
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In addition, let q;i__i := 0, a2,-i '■= 0, if^"-* := K, and for integers / > 
sequentially define 

Lastly, let TO e [— 1, oo) be the least integer for which at least one of the following 
conditions is violated for some x g [—1/4, 1/4] 

am < 1/A, if > A, 

+ 26„,+iX(™+i)j (28) 

Before we continue, we want to make some remarks. Since 26/ < 1/2, then 
XC+i) < fs:(0/2. Thus, by construction m = 0{\ogK). Note that the third 
condition in ([^5]) implies 26; > l/if'^'^ for all integers < I < m. Also, for 
< ^ < TO, we have 

ai,i < 2a;^2-'^ < 4ai, 1^2,; | < y ^ aj-i < az, < 1/A (29) 

ai-iK^'^ <N ^aK <ai-iK, ai,i/a, = 0(1), a2,i/«? = 0(1) 

This set of observations will be useful later in this section. 

With the notation introduced so far, the second loop of our algorithm can 
be described easily. The loop consists of exactly to + 1 applications of Lemma 
6.7 in |Hlj to the function F{K; a + ax, b); that is, exactly to + 1 applications 
of relation ([26|l . The final output of the process has the form 

F{K; a + ax, b) = C7™e-2^^ai,„.-2..a.„„.^^(^(m+i). ^ ^^^^ y^^^^ 

+R.miK;a,a,x,b) + 0(6'^^"^^) (30) 

where 

m 

R„,{K; a, a, x, b) ^ Q_^g-2-*ai,_i.-2.za,,,_i.^^^^ (^^^(0 ^ ^i+Oiai-ix, bi)) 

Here, C_i = 0, and the constants Ci, < I < m, are independent of x, com- 
putable in 0(1) operations, and are bounded by 0{\/K). 

In order for the relation ((30)l to possibly be useful, we need to be able to 
efficiently compute the functions 

yj ^2iTicy'^ -2iTixy ^-2TTia-i^l_-iX-2-iTia2,i-ix'^ ^ 

$e, {R{K^'^\ai + diai-ix^bi)) dy dx 

for < ^ < TO. Fix I, so we may drop dependencies on it. Let {w,z) be a 
subinterval of [—1/4, 1/4] over which 

[ai + Oiai^ix + 26/if ^'•'J and [a; + 6';Q;/_ia;] are constant for all x e (u>, z) 
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Since ai-i = 0(1), the interval [—1/4, 1/4] can be written as the closure of the 
union of finitely many such subintervals. So, we can restrict our attention to 
the interval (w^z). By lemmas 6.7 and 6.8 in |Hlj . there exist four constants 

{'^i}i<i<4, and four integers 9i G { — 1, 1} satisfying 

< Xi + OiUi^ix < 1, for all x e {w, z) and 1 < i < 4, (31) 

such that R{K^^'^ , ai + 9iai-ix, bi) is equal to a hnear combination of the func- 
tions 

(A + 6/Q;;_j^a;)''e2iri-L(A+eaj_iK)-27r_R(A+eQi_ia:) 
27riQ(A+flai-ia:)-27r(l-i)r / j -27r(l-i) , t-2nrt 

e / re v^W 



where (A,(?) G {(A^, 0i)}i<i<4, plus an error bounded by 0{AK ^e). Here, the 
integers B, L, R, Q, d, and r satisfy 

Q e {0, S e {-1, 0, ifC+i), ifC+i) + 1}, 

0<p = O{v), if(') < i < i^f') +p 

0<(7 = O(zy), K'-^'^ < R < K^^^ + q 

0<d^O{iy), 0<r^O{i^) 

The length of the linear combination is 0(A). The coefficients in the linear 
combination can all be computed in 0(A) operations, are bounded by 0{K), 
and of course are independent of x. Also, big-O constants are absolute. 

If f{x) assumes any of the first three forms in ((32l) . then by Lemma [5T5l the 
integral 



can be computed efficiently. If f{x) assumes the fourth expression in ([32]) . then 
make the change of variable A + 9ai~ix —>■ x followed by y — > y/ai-i. This 
yields an integral of the form 



1 

Ni Jo 



yj ^2-iTiciy +2iTiriy I ^2iTi(T±y)x~2TTipx ~2ttRx ^3^>| 



where if^'^ < Ni < K, CiNf < 1/A, {w,z) C [0, 1], and, by set of observations 
pS]). the real numbers (3, rj, t are bounded by 0(1), 0(1), 0{Ni) respectively. 
Once again, Lemma [575] shows that the integral (|34p can be evaluated efficiently. 
In fact, the Lemma proves a much more general statement. 

Lastly, suppose f{x) assumes the fifth expression in ([32]). For the sake of 
clarity, let us assume r = and = in that expression. The treatment of 
the case when r or d is positive is similar. Once again, we make the change of 
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variable A + Oai^ix — *■ x followed by y — > y/ai^i. This yields an integral of the 
form 



y]^2^^c^y +2^rr,y / ^27rt{ri±y)x-2nt0x I dt dx dy (35) 

Jwi Jo 



1 

Nl 

where ti is a real number bounded by 0{Ni). The treatment of the integral (|35p 
is completely elementary, albeit quite tedious. This is the reason we relegate 
the evaluation of many of the integrals that occur in this paper to Section 5. 
Despite this, in the next few paragraphs, we show how one can begin to tackle 
the exponential integral psp . This might give some idea of the type of manip- 
ulations we use to simplify such integrals. Note however, the actual evaluation 
of these integrals employs some basic saddle-point techniques. Although we do 
not discuss this in the present section, it is discussed in detail in Sections 4.3 
and 5. 

So, first observe we may assume wi > A^/lFi in ([55)) . otherwise, for all 
X e [wi, A\/2bi], the cross term g-'^^('^^'^)'-^t/V2h (.a,n be routinely eliminated via 
appropriate changes of variable and Taylor expansions, which yields integrals 
of the type handled by Lemma 15.51 We integrate with respect to t in (|35p to 
obtain integrals of the form 



/ yig2Ticii/^+2irii;i/ / _ ^2-Ki(T2±y)x-2TTipx'^ -2tt^x j-gg-j 
Jo Jwi X 



where T2 — 0{Ni) is a real number, and 7 G {0, l/\/26l}. Since the case 
7 — l/y/2bi is negligible, we can take 7 = in Apply the change of 

variable x — s- x/wi, and note \/2bi/wi ~ 0{1). Then, we arrive at 

— — / yig27r«cii/^+2irij7y / ^^2iTiwi[T2±y)x-2T!i0w^x dv (37) 

NiJo Ji X y \ > 

By hypothesis, zi/wi < VlC^. So, the integral ([37|) can be subdivided into 
0(log zi/wi) = 0{logK) consecutive subintegrals of the form 

— — / yJg2T«cii/^+2;rij)i/ / _„2Triwi{T2±y)x-2TTif3wlx^ i i /qo^ 

NiJo Ja X y K > 

where 1 < ^ < zi/wi and < A < A/2. Next, we make the change of variable 
x ^ X — A ui ([55)1 . This replaces x~^ with [x + A)~^ . We factor out A from 
{x + A)^^ , and then apply Taylor expansions to the resulting term (1 -\- x / A)^^ . 
The said series of manipulations reduces ([55)1 to integrals of the form 



1 



yj ^2Triciy^+2TTir]iy I ^s^2-Riwi(T3±y)x-2-Ril3wfx''^^^y j-gg-j 



N( Jo 

where < s ~ 0{i') is an integer, and T3 — 0{Ni), -qi = 0{^/Ni) are real 
numbers. To conclude, make the change of variable wix x. This leads to an 
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integral of the form 

{wiAy+^NfJo Jo y \ I 



The integral (|40|) can essentially be handled via Lemma 15751 

Finally, Lemma 15.51 shows how to deal with a violation of the second con- 
dition in ([28|) . Furthermore, a violation of the third condition in (I28p can be 
handled by appealing to Lemma 15.91 In particular, we may assume when the 
quadratic algorithm terminates that if^^+i) > A and Um > 1/A. 

To summarize, let M := iir^™"''^^ a := Um, cti ■= ai,m, and a2 '■= a2,m- We 
can assume m > 0, for else, the third condition in (^5]) is violated for to = —1, 
in which case Lemma 15.91 takes care of the situation. A simple estimate then 
yields |a2,m| = |a2| > l/K"^- So put together, our problem has been reduced to 
evaluating 



1 



N ^1/4 

/ exp{2nicy^ — 2-Kixy) exp{2nia2X^ — 2-Kiaxx) x 

J-l/4 



where 



F{M;a + ax,b)dxdy (41) 

N<AK'^^^, A<M<N/a, l/A<a<N/M 
0<ai<4a, 1/A < cTV^ < 2/A, <\a2\<a (42) 

4.3 Some saddle-point calculations 

We want to compute ([H]) with conditions Let us explicitly write (|4ip as 

1/4 
M 

exp(27ri(a + ax )k + 2TTibk'^) dx dy (43) 

Define T := [A^J. Then, split the sum over k in (|43|) into three possibly empty 
subsums: one that consists of the terms T < k < M — T, another that consists 
of < A; < T, and a third subsum consisting oiM — T < k < M. By Lemma lSTSl 
the last two subsums can be computed quickly. We can assume the first subsum 
is not empty, otherwise, the algorithm terminates. By Lemma l5.6i for each term 
in the first subsum, the domain of integration with respect to x can be extended 
to (— oo, oo). Each of these terms thus becomes 



1 /■1/4 

— — / / y^ exp{2nicy^ — 2'Kixy) exp{2nia2X^ — 2maxx) x 

Jo J -1/4 



-j^ f-N I'OO 

—7 / J/"' exp(27ricj/^) / exp {2i:iakx — 2'Kixy — 2'Kiaix + 2Tria2x'^') dx dy 

Jo J — oo 



exp {sgn{a2)'-f 



y/2\a2\N^ Jo 



N 



z/ exp(27ri/fc(y)) (44) 
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where 

'a^^'^^-i 1 if:,>0 



and 



Jk{y) = fk{y,c,a,ai,a2) cy \ y 



4a2 2a2 4a2 
Here, the easily-provable formula 

, pSgn(y)TTi/i 2 ,,. . 

e2^"*+2''*^*' dt = - — /f'^^) , a;, y e M, 2/ 7^ 
V2|y| 

was used. We want to extract the saddle point contribution from To this 

end, define 

Vk = 2/fc(c, a, ai, 02) := ^^^^ (^1 - \/l - 24cQ;2(aA: - ai)^ 

By choice, j/fe satisfies f'i^{yk) = 0. ft might be helpful to keep in mind that 

48 

\2Aca2{ak - ai)\ < 



MA 



In particular, 2Aca2{ak — ai) is "relatively small." The integral (|44| can be 
written as 

exp (ggn(a2)f) ,„ , ^ ,„ ., , , 

— exp [2m}k{yk)) / r exp {27nhk(y - yk)) ay 

y'2\a2\Ni Jo 

where 

hk{y) = hk{y, c, a2) := cy^ + (^3cyk - 
Some algebraic manipulations give 

fkiVk) = gg4g2Q,3 ((-'^ ~ 24cQ;2(afc - ai))^^^ - 1 + 36ca2(Qfe - ai) 
-216c^a2(afc - ai)^) 



"2 ;=3 

f;(24)'g,c'-24-^ Y: (l) hiy-'a[~^a^k^ 

1=0 s=0 

oo oo //I \ 



s=0 1=0 



Ydsk" (45) 



s=0 
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where 

30-0, .gi = 0, 32 = 0, |g,|<l for / > 3 (46) 

Note that gi depends only on I. Since \qs\ < 2^+^,|q;2| < a, cN"^ < 1/A, and 
ak < N, it follows 

|4+3fc'+^| < 2(48)3ca3fc3(48)''c''a2"M'' < ^ for s > (47) 



JV 



Ik i '■= — , — / y-' exp {2TTihk{y — %)) dy 



Define 



Using the notation defined so far, the sum ((43|) has been reduced to a sum of 
the form 

M-T 

exp [2'Kiak + 2Tribk'^ + 2'Kifk{yk)) h.j (48) 

fc=T 

By Lemma 15.71 the computation of the sum can be reduced to evaluating 
sums of the form 

1 

— E fc'^ exp(27rm/c + 2mbP + 2^i/fe(yfe)) (49) 

fc=0 

where < r = 0(A) is an integer. By estimate (|47|) . we can truncate the 
sum indexed by s in ([iS]) at S 3 + [log A^/ log MJ . The remainder term 
"^1^=3+1 dsk^ 1 which is bounded by 0(1), can then be routinely eliminated via 
Taylor expansions; see Section 2 for a similar calculation. This results in sums 
of the form 

1 *^ 

j^i^k'- exp(27ri/3ifc + 2Trif32k'^ + ... + 2ml3sk^) (50) 

fc=0 

where /3s are real numbers, = 3 + [logiV/ log AfJ , and for s > 
< ; = O(A^), A < Af < AAf < K^K^I^ 

N 

|/3i|e [0,1/2], 1/32 1 e [0,1/2], |/3,+3|<_^ (51) 



The sum (|50l) can be viewed in an alternative way. This viewpoint will be 
useful in Section 4.5 when M is small relative to A^. Recall that f3s = dg for 
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s > 3, where dg is defined as in So, for s > 1, the coefficients (3s+3 have 

the form 

oo 

Ps+3 = ?7s := A'* ^ zi^sl^ 

1=0 

< 1/2, 2(48)^ca^ A := 2caa2, 7 := 48caia2 

and z/^s depend only on / and s. Therefore, the sum (jSOp can be presented as 

1 ^ 

^ A:' exp(27ri/?ifc + 27ri/?2fc^ + 2ml3'ik^ + 27ri/i7/ifc'* + . . . + 2Trifiris-3k^) (52) 

/c=0 

and for s > 1 we have 

AT 1 1 

In particular, if /i. A, and 7 are determined, then so are the coefficients firjs. 

4.4 The third loop: the FFT precomputation 

We want to precompute sums of the form ||5D|) with conditions ([?T|) . So, let n 
be the largest integer satisfying 2" < M and let R be the smallest integer that 
satisfies 2^ > A^if^/^. Note that n < R, and i? depends only on K and A. We 
may assume that n > 0. 

Instead of evaluating ([SO]) , let us evaluate the more general sum 

1 ^ 

F,(M, M; /5) := + exp(27ri/3iA: + . . . + 2Tripsk'^) 

where q, M > are integers, q + M < M, (3 :— {Pi, . . . , /3^), and 
l3 + R/n\ , |/?,| <min{l/2,2^-22«(3-2.)| 

To this end, let go := 0, Mq := M, no := n, and /S^"^ := /3. For integers 
d> 0, and for as long as Md > 1, sequentially define 



Md+i ■■= Md - 2"'', Ud+i ■■= the largest integer satisfying 1 < 2"''+i < Md+i 



p—s 



Note Md + Qd < M, Md+i < Md/2, and Ud < n - d. With this notation, we 
have 

F,,{M,Md-J^'^) = F,,(Af,2"'' - 1;;3('^)) +CrfF,,^,(M,Md+i;/3('^+i)) (54) 
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where Cd is computable in 0{S'^) operations and satisfies \cd\ = 1 

We want t 
the inequality 



We want to obtain a bound on the coefficients (3i'^\ By induction, suppose 



|^(<i)| < 2^-22"(3-2.)2^'i(^ ^ l/{2R)f (55) 

holds for 3 < s < S. Note condition ([55)1 is satisfied for d = and 3 < s < S . 
For n > log2 i? + 4 we have 



< 2^-22«(3--2s)2s(d+i)(2 + l/(2i?))'^+i 



So, estimate ([55)) holds for such n. If n < logj i? + 4, then M < 32R, and the 
sum can be evaluated directly. 

Using Taylor expansions, the evaluation of Fq^(Af, 2"'' — I;/?''*-') is hence 
reduced to computing sums of the form 



V /c"exp 27ni^A: + 27ri^fc2 + ... + 27ri^^fc' 



where < m = ©(A"^) is an integer, and by estimate (|55|) . the integers ct- 
satisfy 

< 2«.-i, < 22"^-i, < min{2---i,2«-i2(3-^)«, } 

provided logg i? + 4 < n < R. Finally, to evaluate F^^^^ (M, M^+i; /3('*+^'), 
simply iterate the relation ([53]). 

In summary, after at most R — O(logiir) iterations of relation ([5^ . our 
problem is reduced to evaluating the sums 

^ E fc'-pN^fc + ... + 2.^^fc^j (56) 

with conditions 

2^<2A^K^/'^, l<n<R~l, S^[3 + R/n\ 
0</ = 0(A3), 0<d<7i-l, l^i'^ll < 2("-'')~i 

< < niin{2^("-'^)-\2^-i2(3-^)«| (57) 

4.5 The cost of the FFT precomputation 



We use FFT to precompute the sums ()56l) with conditions ([57|) . Note that for 
< s < + 3J , we have R + 3n — sn > 0. So, for a given n, d, and /, the 
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FFT cost is at most 

[3+R/ni 

Yl min{2''",2«+3"-''"} (58) 

s=0 

But mill {2''", 2^+3"-''"} = 2"" for s < \R/2n + 3/2]. A straightforward cal- 
culation then reveals the quantity ((58)) is bounded by 

2(Lfl/n+3j-r-R./2n+3/2]+l)(fl/n+3-ri?/2ri+3/2])n ^ ^gg^ 
rfl/2n+3/2] -1 lB./n+3\ - [i?/2n+3/2] 

Jl 2"" Jl 2 



s=0 s=0 



Note that < [R/n + 3\ — \R/2n + 3/2], so neither of the products above is 
empty. 

Define f{n) := n(i?/2n + 3/2)^. By Lemma [5^ the quantity ([55]) is at most 
2-'^("). But f{n) < 4R exactly when R/9 < n < R. So, for each R/9 < n < R, 
< d < n — 1, and < Z = 0(A'^) the cost of the FFT precomputation is at 
most 2'*^. Therefore, the total cost for all R/9 <n<R, 0<d<n — 1, and 
< / = 0{A^) is less than k^K^I^ operations. 

If n < R/9, then the length of the remaining sum is at most 2^/^ < KK"^/^^ . 
It is possible to directly evaluate such sums. However, if we do so, a simple op- 
timization procedure then shows that the complexity of the resulting algorithm 
to compute the zeta function will be only ^37/117+0(1)^ -y^g want to avoid direct 
computation of these sums so that we are able to achieve t''/i3+o(i) complexity. 
This is where the alternative viewpoint ([5^ with conditions ([55]) . which was 
discussed in Section 4.3, becomes helpful. 

According to formulation ([5^ . the sums to be evaluated have the form 

1 

— ^ fc" exp(27ri/3i/c + 277^/32^^ + . . . + 2TriPsk^) 

where A < M < 2^/^, < it = 0{A'^) an integer, 5 = [3 + logA^/logMj, and 
for integers 1 < s < 5* — 3 

C30 

/3s+3 = flVs, Vs ■= A'' ^ Zi^sl'' 

1=0 

Recall that the coefhcients z; s depend only on I and s. They also satisfy the 
uniform bound \zi^s\ ^ 1/2- According to set of conditions ([55)l . the parameters 
/i, A, and 7 conform to the bounds 

1^1 < 2^-^P, \X\ < 2-^P, I7I < 2-2p, l^^^l < 2«-3p-23P 

where p := log2 M. By hypothesis, p < R/9 + I. 

We perform "perturbation analysis" on /3s+3 for s > 1. To this end, let ei 
be a real number satisfying |ei| < 2"^''. Then 



00 



(M + ei)A^2^z/,,7' 



2^P+'P = p7js2^P+'P + 0{ei2''^-''^P) 
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Since s > 1, Taylor expansions allow us to discrctizc fi using step size 2 ^p. 
Since < 2^~^p, this yields 2^^^ possible discretizations for /j,. Similarly 



IJ,{X + €iy^Zi,. 



23p+sp = ii'ns2^P+^P + O {ei2^2^+'^P-^P) 



Therefore, we need to determine A only up to a multiple of 2 ■'^ (note this is 
less than ei). Since |A| < 2~'^p, this gives 2^~p discretized values of A. Finally 



MA'^^i,.(7 + ei)' 



23p+sp ^ iM],2^P+''P + O (ei2^- 



Thus, it suffices to dissect 7 using 2^ ^ step size. Since I7I < 2 "^p, there are at 
most 2^~^P possible discretizations of 7. 

To conclude, for each u, S, and M the cost of FFT is 0{2^P2^^-^p) = 0{2^^+p). 
Since p < R/9 + 1, there are 0{A'^R2^/^) possible tuples {u,S,M). It follows 
that the total cost is 0{A'^R2^^+^^/^) = 0{K^K^I^). 



5 Auxiliary results 

Suppose K > Q, j > and Af > are integers, and c is a real number. Let 
B^m denote the Bernoulli numbers. Recall that i?2m = 0((27r)~^'"(2rn)!). Let 
fx,c,j{y) and Em denote a function and a quantity satisfying 



fx,cAy) = ^ exp(27ricy=^ + 2mxyl \Em\ < \ftZ^^\y) dy 



Lemma 5.1. For any j > 0, any K > A{K,j,e), and any real c, Euler- 
Maclaurin summation gives 



1 • 1 /■ ■ 

— T 2_, exp(27ricn^ + 2'Kixn) = — r / y^ exp(27rici/^ + 2-Kixy) dy + 

n=0 Jo 
m—1 ^ '' 

Lemma 5.2. For any e € (0, e~^) , any integer j > 0, any positive integer 
K such that K > h.{K,j,e), any x G [—3/4,3/4], and any real c satisfying 
IclK"^ < 1/48, we have 



max 

0<y<K 



< {QncK'' + 37r/2 + (2m + j)/Ky 



In particular, if M = [81ogif/e], then \Em\ < e. 
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Proof. We can write /i^j(y) = Px,c,mj{y) exp{2'Kicy^ +2TT ixy), where 

iv) 

is a polynomial in y of degree 2m + j (and a polynomial in x of degree m) . In 
particular, Px,c,m,j{y) has the form X^^IlV''' '^TxcV^' where (ij"^;''^ are polynomials 
in X of degree at most m. Let | J2i<m ^i\i,t ■— J2i<m l-^'ll^'l denote an Z^-norm. 
For zi{x) polynomials in x, define \J2i<m ^ii^)y'h,y^^ J2i<m\^ii^)\iMy'\- 
By induction, suppose that 

max \Px.c.rnAy)\ixv ^ {6ncK^ + 3tt /2 + {2m + j) / 

Then 

max |p;,^„,^.(y)| < K-\2m + j)i67TcK^ + 37t/2 + {2m + j)/ Kr 

0<y<K ' >J I i,x,y 

It follows that 

max \Px c m+i i{y)\^ < niax IGttc-v^Pj; c m 7"('V) L + 

^max^ |27ra:P,,,,„,,(y)|^_^ + ^max^ \PUmAy)\i,x, 
< (67rcif2 + 37r/2+(2m + 2 + j)/^)" 

□ 

Lemma 5.3. There exist absolute constants kt, Kg such that for any e G (0, e~^), 
any integer j > 0, any positive integers K , Ki satisfying K{K, j, e) < Ki < K , 
any a, b £ M/Z, any real c satisfying \cKl\ < 1/48, any 1 <m = 0{v{K,j,e)), 
any a G [—1,1], and any interval [w, z\ C [—1,1], the sum 



2m 



(2m) 



{fxZ/^ (Ki) - /i'"7" (0)) a - ax, b) dx (60) 



can be computed with error bounded by 0{h'{K, j, e)'^'' e) using 0{v{K, j^e)'^^) 
arithmetic operations. The big-0 constants are absolute and arithmetic is per- 
formed using 0{v{K,j,e)^) bits. 

Proof Write fx'^]j{y) = Py,c,m,j{x) exp{27ricy^ + 2nixy). Then Py.c,m,j{x) has 
the form X^IIlo ^Tyc^^^ where z^y^ are polynomials in y of degree < 2m+j. From 
theproof of LcmmalEl l^L^dh^ < {G-^cKf + 37r/2 + {2m + j)/Ki)"' = 0{{27:)^" 
So, in order to compute (|60|l . it is enough to compute sums of the form 

/ x'' exp{2TriKix)F{K;a — ax,b) dx / x' F{K; a — ax,b) dx (61) 

The sums (|6T|) are handled by Lemma FS-lOl □ 

Lemma 5.4. There exist absolute constants Kg, kiq such that for any e €E (0, e~^), 
any integer j > 0, any positive integers K, Ki satisfying A{K,j,e) < Ki < K, 
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any a, b E M/Z, any real c satisfying \cKi \ < 1/48, any a E [—1,1], and any 
interval {w, z) C (—1, 1) such that \w\ > 1/4, the function 



\ I I \'e^''"'y'-^''"yF{K;a + ax,b)dydx (62) 
K{ Jw Jq 



can be computed with error bounded by 0{v{K,j,e)'^^e) using 0{h'{K, j, e)'^^°) 
arithmetic operations. The big-O constants are absolute and arithmetic is per- 
formed using 0{iy{K,j,e)^) bits. 

Proof. Conjugating if necessary, we may assume that w is positive. Define the 
contours 

C'l :== {te-'^/^lO < t < 2Ki/V3}, Ca := {Ki - it\0 < t < Ki/VS} 
as well as Cq {t\0 < t < Ki}. Also define 

4(C) i / y3e^''"'y'~^''"y dy 
K{ Jc 

With this notation, ([5^ can be rewritten as f Ix{Co)F{K;a + ax,b)dx. By 
Cauchy's Theorem 4(Co) = h{Ci) - ^(Ca). But 

2Ki 

K{ Jq 



K{ 

where the constants di, d^ have modulus one. Since 2: > w > 1/4, cK\ < 1/48, 
the moduli of the integrands in both Ix{Ci) and /^(Ca) decline in the region 
[0,V2Ki] faster than 6"^/^. So, in both cases we can truncate the domain of 
integration with respect to y at L = v(K,j,e). Once truncated, the integrals 
with respect to y can be subdivided into L consecutive subintegrals. Taylor 
expansions can then be used to reduce to integrals of the form 

f f y^x''e~'^'''"''~''"''F{K;a + ax,b)dydx 
Jw Jq 

yl^s^-2^iK^x-2^nxp(^ J^.^ fl + aX, b) dy dx 



1 

I ^—2'jTiKix — 27Tnx j 

w JQ 

where 0<?i<L, 0<s = 0{v{K,j,e)), and < / = 0(z/(/C, j, e)). We 
integrate directly with respect to y. As for the integral with respect to x, make 
the change of variable nx — ^ a; if n > 0. Then, subdivide the resulting integral 
into n consecutive subintegrals. Finally, use Taylor expansions to reduce to 
integrals of the form ([6T|) , which are handled by Lemma 15.101 □ 
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Lemma 5.5. There exist absolute constants kh, ki2 such that for any e G (0, e~^), 
any integer j > 0, any positive integers K and Ki satisfying A(K,j, e) < Ki < K , 
any integer 1 < m ~ 0{Kf), any integer < p — 0{v{K, j, e)), any (3 £ [—m, m], 
any a £ [—rn, m], any A G [0, m], any 6 G { — 1, 1}, any rj G [—m, m], any inter- 
val (0,w) where < w — 0(1), and any real c satisfying \cKi \ < l/A{K,j,€), 
the integral 



1 

can be computed with error bounded by 0{v{K, j, e)''"if ~^e) using 0{v{K,j, e)''^^ ) 
arithmetic operations. The big-0 constants are absolute and arithmetic is done 
using 0{v{K, e)'^) bits. 

Proof. Let us first assume A = 0. Conjugating if necessary, we may also assume 
/3 is a non-negative number. We comment it is possible to replace the assumption 
9 G {—1,1} with 6* G [—1,1] say. We do not pursue this as the Lemma is already 
more general than is needed for the purposes of this paper. 

Define L := 8{p + l)'^i'{K,j, e). Suppose (3 < L. Then, make the change of 
variable a; Lx in (j63p . Subdivide the resulting integral into 0{L) consecutive 
subintegrals over the intervals [n,n + S\, where < n = 0{L) an integer and 
5 G (0, 1]. Note 5 is usually 1 except possibly for the last interval. For each 
subintegral, make the change of variable x ^ x — n. Then use Taylor expansions 
to reduce the problem to evaluating integrals of the form 

— / ' y3^2^^cy^+2^,-my f ^.g2«iil^^ ^g^) 

Kl Jq Jo 

where < s = 0{v{K,j, e)) an integer, and rji, ai = 0{Kf) real numbers. Split 
the integral with respect to y into two subintegrals over the intervals 

h {y e [0, K^] \\y-ea^\> 2(s + 1)L}, h [0, Xi] \ h 

If /2 is not empty, make the change of variable ai—9y — > j/, then eliminate the 
resulting cross term ^"^^^vl^ using appropriate changes of variable and Taylor 
expansions. Note since I2 is not empty, then \ol\\ < Ki + 2(s + 1)L. So, this 
procedure results in integrals of the form see below. These integrals can 
be evaluated efficiently. 

As for the interval Ii, integrate directly with respect to x in ([64]) to obtain 



{s + l-ry.Kf Ji, 



y\eai - yye^'^'^y +2'^*''2y dy (65) 



where l<r<s + l&Ti integer and 772 = 0(?7i + 1) a real number. Since 
\y — 9c(i \ > 2(s + 1)L for all y G /i, it is not hard to see that we can subdivide 
(|65|) into OiXogKi) subintegrals of the form 



.A+A 

/ y^{9ai-yye^'''''y +^''''^'ydy (66) 

J A 



{s + l-ry.Ki 
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where A e /i, A G (0,^], [A,A + A] C h, and \ A/A\ < 1/2, A := 0ai - A. 
Note \A\ > 2{s + 1)L. Make the change of variable y ^ y — A in ((66)) to obtain 
an integral of the form 



{s + l-ry.Kl 



Write {A — y) as ^ ' (1 — y/A) then apply Taylor expansions to the term 
(1 — y/A)^^. This reduces the integral ([67]) to the form 



(.s + 1 - r)lKl Jo 



where < /i = 0{i^{K, j, e)) an integer. Expand the term (A + y)^ as well as 
the exponent (y + A)"^ of e^^^<:(v+^) . xhe problem then reduces to evaluating 
integrals of the from 



A" 



yU^2TTicy +lTTirny +2iTtri3y~ l-ir py ^y ^gg^ 



where < u = 0{v{K,j, e)) an integer, A < Ki, < p, r/^ — 0{i]2 + 1) a real 
number, and 774 = 0(1) a real number. 

The integral ([55)1 is a variation on the Airy integral; see |GKj for example. 
We show how to compute it efficiently in Section 4.3 and Lemma [5.71 In fact, 
this integral can still be computed efficiently even if we relax some of the con- 
ditions on c, 773, and 774. Let us elaborate. If p = in (j69p . then consider two 
cases: either the integrand has a critical point or it does not; that is, either 
3cy^ + 2r]4y + 773 has a zero in [0, A] or it does not. In the former case, extract 
the saddle-point contribution as done in Section 4.3. In the latter case, use 
Cauchy's theorem to shift integration contours to the stationary phase of the 
integrand as done in Lemma [5771 If p > in (j69p . then the integral can still be 
computed efficiently; we only need to truncate it where appropriate, then apply 
Taylor expansions to obtain either integrals of the form (j69l) but with p = or 
other elementary and easily-calculable integrals. 

We may now assume (3 > L. For simplicity, we will also assume p = 0, 
which is in fact the only case needed for the purposes of this paper. Let 
/4 C [0, Ki] denote the interval satisfying (a — 9y)/{2f3) £ [0, w] for y e I4, and 
/5 := [0, i^i] \ /4. We wish to evaluate 



1 



yj^2Tricy +2iri7]y / ^2TTi(a-0y)x-2-!TiPx dj/ (70) 

Jo 



Let us do that over the domain {x,y) e [0, tw] x /s first. We may write 
I5 = Jg U /y, where Iq is the interval where a — 9y > 2f3w and Ij is the in- 
terval where a — 9y <Q. We evaluate ([70]) when y £ Iq first. Assume Ig is 
not empty. Let ei := K^^e. Write /g = I^i U /g, where /g is the subinterval 
of Iq where a ~ y > 2(3w + ei and ;= Iq\ Iq. Assume /g is not empty. 



28 



Then, Cauchy's theorem followed by the change of variable ^flfix x reduce 
the integral ((70)) when x e [0, w] and y e Jg to integrals of the form 

K{ Ji^ VW Jo 

TP 1 



where fj = 0{'q + 1) a real number, and 



, , a — 9y a — 9y — 2Pw 



2/3 

We evaluate the second integral in ([7T|) . the first integral being even easier 
to evaluate. Note that T2{y) > ei/y/2j3 > for all y in the closure of /e- So 
by Cauchy's theorem, the inner integral in the second integral in (j7ip can be 
replaced with 

00 



Let R := \^y8i'{K,j, e)] . Note we can truncate the integral (|72|) at i?. The 
truncated integral can then be subdivided into R consecutive subintegrals over 
intervals [n, n + 1], where < n < i? an integer. For each subintegral, make the 
change of variable x —> x — n, then apply Taylor expansions to quadratic factor 
in the integrand. This yields integrals of the form 



e 



2/3 



/ a;'?e"2'''='"^''^^(^)="-2'^"^dx (73) 
Jo 



where < n < R and < q — v{K,i, e) integers. Plug ([75]) back into the second 
integral in ([71]) . We obtain 

1 / , Jp27rici/^+2Tri))i/-27re'"/*'r2(j/)n /" „i3 „-27re'"'/*T2 (y)2:-27rnx j i /y^N 

miJie Jo ^ ^ ^ 



Split /g into two intervals I7 and /§, where /7 is the interval that satisfies 
|g'^V4^2(y) + n| < (7 and /§ '■— leXh- Note and is, as do all other intervals in 
this proof, consist of 0(1) many intervals of the form [s, t] where s,t E [0, Ki]. 

Over make the change of variable x —^ qx, if g > 0, and subdivide the 
resulting integral into q consecutive subintegrals over the intervals [m, m + 1], 
where < m < g an integer. For each subintegral, make the change of variable 
x —> x — m, then apply Taylor expansions to finally obtain integrals of the form 

1 f (e"/'^r2(j/) + n)'^ J 27Ticy''+2Trir,y-2Tre^*^*T2(y)(n+m/q) 



J I, q'' 



yj ^2TTicy'+2Trirjy-2Tre^'' T2(y)(n+m/q) / x''^'^ dx dy (75) 
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where < h = 0{v{K,j,e)) an integer. To conclude, directly integrate with 
respect to x in (|75p . then note the remaining integral with respect to y can be 
easily written in terms of integrals of the form ([M)) . which can be evaluated 
efficiently. 

As for the interval /g , directly integrate with respect to x in ([71)1 . This yields 
integrals of the form 

{q + l-ry.VWKlJiA(^^'^*My)+ny 



where l<f<9+lisan integer. Once again, the integral ()76p can be routinely 
reduced to integrals of the form ([69]) . 

It remains to evaluate the integral (l70|) over the domain {x, y) £ [0, w] x 1^^ . 
But the magnitude of ([70)1 over this domain is 0{K~'^e). Therefore, we can 
ignore the domain [0, w] x . Also, it is not too hard to see that the evaluation 
of the integral (|70|) when the domain of integration is set to [0,w] x Ij is similar 
to the case [0,w] x /g already considered. 

Lastly, we evaluate (|70|) when the domain of integration is restricted to 
{x,y) G [0,w] X I4. Note Ti{y) > whereas T2{y) < over I4. An application 
of Cauchy's theorem followed by the change of variable ^/2f3x x reduce (|70p 
over [0,w] x I4 to integrals of the form 

y3^27rtcy^+27rtvy / e^^^"" da; dy (77) 



/2pKi J I, 



/ y3^2^^cy''+2^^vy / e^^^^ (^)''+"^' rfx rf?/ (78) 

Jia Jo 



The integral ([75]) can be evaluated in a similar way to the second integral in ([7T|) . 
As for the integral (|771) , one can first compute it over the domains [2^/j3w, cx)) x J4 
and (—00,0] X I4. Over these domains, the integral with respect to x in (|77p 
declines exponentially in both its linear and quadratic arguments. It is then not 
too hard to see that over these domains, the integral (|77p can be reduced to the 
form ([69]) : also see Lemma [5^ So, instead of evaluating ([77]) over [0, x I4 

we evaluate it over (— cxd,cxd) x I4. This yields 

J —oo 

yj g2-!Ticy^+2-!ririy+TTiT^{y) ^jQ^ 



2(3 K{ Ju 
1 



2pKi 

The integral ([75)1 can be reduced to the form ([M]) . which can be computed 
efficiently. 

Finally, the treatment of 163]) when A > is analogous. □ 
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Lemma 5.6. There exist absolute constants K13 and ku such that for any 
e€ (0, e^^), any integer j > 0, any positive integers K and Ki satisfying 
A(K,j,e) < Ki < K, any a G [1/ A{K, j,e), Ki], any a\ G [0,4a], any real 
number a2 satisfying < \a2\ < ol and a2/a^ — 0{1), any a, 6 G R/Z, any 

real number ci satisfying \ciKi \ < l/A{K,j,e), any positive integer K2 sat- 
isfying, if possible, the bound 2A(K, j,e)^ < K2 < Ki/a, and for contours 
Co = {i|l/4 < t < 00}, = {t\~ 00 <t < -1/4}, the sum 

1 ""'^'^ 2 f^^ f . 

\ ^ ^2TTiak+2TTibk j I yj ^2TTiciy -2TTixy ^2TTi(ak-ai)x-2ma2X dy (80) 

Ki Jo Jc 

where Co G {Co, Co}, can be computed with error bounded by 0{v{K,j, eY^^ K^"^ t) 
using 0(y{K, j,e)'^^'^ ) arithmetic operations. The big-O constants are absolute 
and arithmetic is done using 0{h'{K, j, e)^) bits. 

Proof. For simplicity, assume A — A{K,j,e) is an integer. Since a2 ^ and 
y G [0,Ki],\Ki\ < oo, the integral with respect to x in ([SO]) is uniformly bounded 
in y. Thus, the order of integration in (|80|) does not matter. It is sufficient to 
compute (|80l) over Cq, because over Co we essentially obtain a conjugate sum. 
By the change of variable x ^ x — 1/4, the sum ([80|) is then reduced to the form 

^2niaik+2Tribk^ / / yj ^2TTiciy^ -Tiiy /2-2TTixy ^2TTiTkX~2TTia2X^ ^y ^g^^ 

Ki ^^^2 Jo Jo 

where ai := a + a/4 and Tk := ak ~ ai ~ 0:2/2. Note Tk > a(A'^ — 5) > A — 1 
for k> A^. Define 

Ci := {te-'^'/^lQ < t < 2K1/V3}, C2 := {Ki - it\Q < t < Ki/Vi} 

By Cauchy's Theorem we can write the integral with respect to y in ([5T|) as the 
difference of integrals over Ci and C2 . Starting with the integral over Ci , the 
sum (ISTI) becomes 



J_ g27rMifc+27ri6fc2 f f yj ^27TCiy'' -Tve^'^^y /2-2TTe'"^^ xy ^27TiTkX~2TTia2X^ ^y 

Ki ^^^2 Jo Jo 

The last integral with respect to y declines faster than e~'^/^^ over [0, 2Ki/\/3]. 
So, we can truncate with respect to y at L = ll6iy{K, j,e)\, then use Taylor 
expansions to reduce to a sum of the form 

poo pi 

g27rioifc+27ri6fc^ / / yl ^-ne^'''^ {y+n)x ^2TriTkX-2Tria2X^ ^y ^g2) 

where < n < i and < / = 0{v{K,j, e)) integers. Define 

C3 := {e-"/^a;|0 < x < 00}, C4 := {e"/'^a;|0 < x < 00} 
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If Q!2 is positive, then 



'"^ g-7re'''/3(a+n)(T-it)g27riTfc(T-it)-27riQ2(T-it)= 



So, we can integrate x in (|82p over C3 instead of (0,oo). Similarly, if a2 is 
negative, then shift to C4. In any case, we obtain a sum of the form 



^2 -A" .00 rl 



g27rmife+27ri6fe^ / / 2^'g27rie- = ''"(°2)'''/4(^^__e-'''/<'(y+n))a;-27r|a2|2:^ ^2; (83) 

. •> Jq Jq 



We consider both possibilities in (|83p: 0:2 positive or q;2 negative. In the 
former case, the sum (|55|) becomes 



2 />00 /•! 

. , Jo Jo 



fe=A2 

For y G [0,1] and > we have 

3?{e"/Vfe - e"/i2(y ^ „)| > - n - 1 > A/3 (85) 
In particular, in the "complementary" sum 

K2-A^ 1 

y^ g27riaifc+27ri6fc^ / / 

the integrals with respect to x can be truncated at x = —1, which yields 

ifa-A^ ,1 1 



<'2-A „o /■! 

I_AQ J —OQ J 



^27Tiaik+27Tibk^ / ^ig-27r(e'"/Vfc--e''*/i^(y+«))a;-27i-|a2|a;= ^y^^ 

Apply Taylor expansions to e^'^'^'"^^^^'^^, then integrate with respect to y to reduce 
to the form 

-R'2-A^ „i 
y^ g27riaifc+27i-ihfc^ / ^Pg-27r(e'"/*Tfc-e'"/i=n)a;-2ii-|Q2|a;= ^gy) 

where < p = 0{i'{K,j, e)) an integer. Note 

sffje'^^/Vfe - e'^^/i^n} > ^(A^ - 5) - 16i/, 1/A < 02 < a 

Thus, we can truncate the integral (|87p with respect to x at min {1,q;~1}, the n 
eliminate the quadratic factor £-'^^\<^2\x ^gjjig Taylor expansions. Some fur- 
ther simple manipulations finally yield sums of the forms G{K,l] a,b), where 
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< ^ = 0{v(K,j, e)) is an integer. Such sums can be computed efficiently; see 
Section 1. 

Now, suppose a2 is negative in ([55)1 . Note 

3fi{e3-/4^^ - e'^'/'^{y + n)} < + (y + n) 

So, when a2 is negative the real part of the coefhcicnt of X in dMl) is < -A/3 
for all y £ [0, 1] and fc > A^. Thus, we essentially obtain an integral of the form 
Put together, our problem has been reduced to computing 

where a2 > 0. Integrating with respect to x in (j88p yields a sum of the form 



._A2 Jo 



where 02 — 0{a^/a2) and 61 = 0{a^/a2) are real numbers. Of course, both 
a2 and 61 can be reduced mod 1. We point out their "raw magnitude" to show 
that they can be expressed in 0{v{K,j,e)^) bits throughout, as do all other 
numbers in this paper. The integrand in (|89p declines very rapidly over [0, 1]. 
In fact, it is completely negligible if n > 0. So, we can assume n = 0. We 
truncate the integral with respect to y at 0.2/0., then eliminate the quadratic 
factor e^"^ ^ /(2a2) using Taylor expansions . Finally, some further simple 
manipulations reduce the problem to evaluating sums of the form G{K, I; a, b), 
where < I = 0{v{K, j, e)) is an integer. Such sums can be evaluated efficiently; 
see Section 1. 

Having shown how to deal with the integral over Ci , we now move on to the 
contour C2 ■ Here we reduce to sums of the form 

1 r°° rik 

^2^^a^k+2^^bk^ / (Ri ^ tyY X (90) 

Jo Jo 

^6TTCiK'fy-6TriciKiy^-2TTCiy^-TTy/2-2-!rxy^--2Trif^x-27Tia2X^ 

where fk Ki - Tfc. Observe that fk > a{A'^ - 5) > A - 1 for fc < 1^2 - A^. 
Truncate the integral with respect to y in ([50)1 at L = [16i^(i^, j, e)J , then use 
Taylor expansions to reduce to sums of the form 



-^1 fc=A2 



g7riaifc+27ri6fe^ / / yl ^-2TTx(y+n) ^-2TTiTkX-2TTia2x'^ ^g^^ 
fc=A2 ^ Jo 

where < n < L and < ^ = 0{i^{K,j,e)) are integers. Finally, ([9T|) is treated 

in a completely analogous way to (|82p. □ 
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Lemma 5.7. There exist absolute constants K15 and kiq such that for any 
e € (0, e~^), any integer j > 0, any positive integer K > A[K,j,e), any posi- 
tive integer N < K, any real c satisfying l/A{K,j,e) < |c-/V^| < 2/A{K,j,e), 
any real number a and positive integer M satisfying a G [1 / A{K, e), N /M] and 
A{K,j,e)^ < M < N/a, any ai € [0,4a], anyreala^ satisfying \a2\ G [1/K'^,a], 
any integer k e [T,M - T] where T := [A{K, j, e)^J , and for 

y'^ ak-ai {ak - ai f 
Uy)=My,c,a,a„a2):=cy - — + ^—y-^-— 

yk = yk{c,a,ai,a2) := ^ \/l - 24:ca2{ak - ai)^ 

hk{y) = hk{y,c,a2) := cy^ + (scyk - y^ 



1 

hj = Ikj{c, a, aua2, N) := / 6^^"^"^^-^''^ dy 

a/2 a2 iV-' Jo 



the function 

can be written in the form 

L ji L ji 

T, . - ~, I r.27r»aifc+27r»6ifc^-27ri/fcfafc) •Sp„,. ^ 

;=o 1=0 



1 

_^^2niaik+2wibik^ -2nifk{vk) yj^ _ _|_ ^2-nia2k+2nib2k^ -2wifkiyk) U; — 



=0 

L 



^g27ria2fe+27ri62fc"-27riA(,/fe) ^ - 1 ^ 0{u{K,j, el'^'^K'^e) 

1=0 

where L = 0{h'{K,j,e)'^'^'^), the numbers ai,bi,a2, and 62 are in R/Z and are 
independent of k, the coefficients zi, wi, and vi are independent of k and are 
bounded by 0{1), and the coefficients wi, and vi are independent of k and 
are bounded by 0{A{K, j,ey). The absolute constant Kig can be taken to be 
equal to 6. Finally, big-O constants are absolute and arithmetic is done using 
0{y{K,j,ef) bits. 

Proof. First, observe 

oc 

yk = ^ui{2Aca2)''~'^{ak-aif 

1=0 

= Y.a'k'{2^ca2Y-^Y.ui+s[ ){'2^cara2y (92) 
s=0 1=0 V * / 
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where uq — 0, ui :— 1, and \ui\ < 1 for / > 1. So, 

oo 

Vk = ^gsa'k', where (93) 

s=0 

oo + S + 1\ 

gs := u424ca2)""^ +ai(24ca2)"^u/+s+if * j(24caia2)' 

1=0 \ ^ / 

In particular, jgol < 2q;i, 51 = 1 + 0(Af'^), and for s > we have 

00 

< |24ca2|" + 2ai|48ca2|"+i^|48caia2|' 

1=0 

< I I I < ? (Q4) 

- {MNY M^{MNy - {MNy ^ ' 

Therefore, yk = ok + Ai, where |Ai| < 20a. So, for T < A: < M - T, we have 
a(T - 20) <yk < N - a{T - 20). We spHt 4^ into the two integrals 

/I ^ := r^'" ^y±±y)L^^^^.iy) dy, II , := r ^y±^,^^^>^.i-y) dy 

We show how to evaluate /^^ first. Conjugating if necessary, we may assume 
the parameter a2 that appears in the definition of hk{—y) is negative (note 
conjugation switches the sign of c also). By Cauchy's Theorem 

j2^^ / ^ '-yfc ^ p27ri/it(-e"'/^y) ^ ^ I V p27Trhk{^[vk + 



y2MiVJ- Jo y2KT^-'' 



Simple manipulations then yield 

Ih = ^'''Ik] + (-z)^+ie2-^(^'=)7^;; (95) 



where 



5(y) 2cy^ - -p— 

4a2 



,2,1 .„ Z^"'' (;/fc - e"/^;/)-'- _2.(3c,, + ^),= +2.e''/-cy^ 





■ io v/2|^A^^" 

Note that if ^- was conjugated in order to ensure a2 was negative, then once 
we reach ([^5)) . we would conjugate back. So, the coefficients 02 and c that 
appear in the definition oi g{y) above are the same as the coefficients that occur 
in the statement of the Lemma. Now, 



1 v/l + 24c|a2|(afc-ai) ^ 1 

3c?/fc + : = . > 



4|a2| 4|a2| 8|a2| 
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Also, the condition k>T ensures that ^/2yk > \/2\a2\[VT\ ■ Let us show how 
to compute /^'j first. Truncate I^'j at •\/2|q;2| IVT\ , then ehminate the factors 
e and e~ using Taylor expansions. We reduce to 



Jo ^/2\a2\N3 

where < d = 0{i^{K,j,e)), < v = 0{v{K,j,e)) are integers. Next, make a 
change of variable y/ ^J2\a2\ y- This simplifies the problem to evaluating 

i6nc\a2\y,n2.c\a2\^/r l'^' ^^^^^ " ^"'^/^"^ "^^^e-^- (96) 

Next, expand the factor (y^ — e'^^^'^^/2\a2\yy in We reduce to expressions 
of the form {y\^^ /N^~^^)Ijj^y^d where < ^ < j is an integer and 

i6ncN\a2\n2nc\a2f/Y[]) ^^^""ifSi^^^' j^^^ y^-^+'^+^'e^^y' dy 

Note that Ij,i,v,d is bounded by 0(1), is independent of k, and can be eas- 
ily computed with error bounded by 0{K~'^t) using 0{K{K,j,e)) operations. 
Therefore, we may turn our attention to the term 2/^+"/iV'+"'. By ([92|), ([M]) . 
and ([94]) . we have 

where R := l3iy{K,j,e)\, \ho\ < S/M, \hi\ < 1 + 0{M-^), and for s>2 



N 



2iV" 2 

< ^ T < 



N{MNy-^ - M 



s-l 



Let 5 :— '^f^2 ^sjfT- For any integer < q — 0{iy{K,j,e)) and any number 



^s=2 '"s M 

0<S = 0{K-^e) we have 



9 , 

9 



{(5 + ho + hik/M) + Sy = ^ rj(,5 + /io + /iifc/M)«~^'5f 

q Rp ,g 

= J2(s + ho + h,k/My-pJ2k,,— 

p—0 s—2p 

where hs^p = 0{RP/M'^^) = 0{l/A{KJ,e)) for s > 2p. So, 
((5 + /lo + /iifc/M) + 5)« = ^(/lo + hik/My-P y h,^p— + 0{K-h) 

p—0 s—2p 
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Therefore, we can expand {yk/NY^'" as a power series ink/M of length 0{v{K,j, e)^). 
The truncation error is 0{K^^€) and coefficients of the power series are bounded 
hyO{l/v{K,j,e)). 

Having shown how to compute /^.^ , we now deal with 7^,'^ . Before doing so, 
let us first observe 

9{v) + fkiy) = yfkiy) ^ diyk) = ~fk{yk) 

4a2 4a2 
Thus, the factor e^'^'aivk) ^ 

can be replaced with a factor of the form 
g27r»aife+27ri6i/c^-27r</fc(yO. Wc truucate /f'] at ^/2\a^\, then make the change of 
variable y/y^2\a2\ y- We reduce to an integral of the form 

(2\a2\y^^y^ -eivcy/2\a2\yky yM^yky--^ty^~27rc{2\a2\f^''y'' , 

Jo 

Next, eliminate the factors e~'^*^ and e"^'^'^^^!"^!) '' y^ using Taylor expansions. 
This yields integrals of the form 



yj+"e V ^ V2i°2iy dy (97) 



where < u = 0{i'{K,j, e)) is an integer. Integrate with respect to y in to 
obtain terms of the form 

{2\a2\y/^j + uy. , , r— ^ 1 

;(2na) y^. , a := 6c\/2\a2\ 



where 1 < f < J + u + 1 is an integer. But 



_n 1 + J I - 2Aca2{ak - ai) 

Vk ^ — 



2(ak — ai) 
1 1 _ (ak — aiY 

— rvi ^ — ^ 



ak — ai ak — ai N' 

s—l 



where gs < {2Aca2Ny < Therefore 

{27ra)-^{2\a2\y/Hj + u)l _^ _ C,.r,u ^ fr\ {ak - a^Y 

-Vk - „.Ar 2^ L„ Z^5s 



Nm + u+l-r)l (ak-aiY ^ \w X'^" 

_ C'j,r,« _ ? {ak - axY 



{ak-axY ^ ^ 

where \Js,w\ ^ r^s^Af"^ = 0(Af~^/^) for s > w, and where the constants 
C,,r.u^O{{3+uY{2aYl^). So, 

(27ra)-- (2|a2|y/^ (j + ^)! _ Q...^ ^ ^ - {ak - a^Y , ^ 
7VJ (j+«+l-r)! ^''^ - (afc _ ci)- ^ 2^ iV- ^ > 
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where R = [Sv^K, j, e)\. Some farther simple manipulations give the result in 
its final form. 

As for Ilj, its computation mirrors that of /|^-. The details of treatment 
differ from those of /| ^ mainly in so far as we appeal to the bound k < M ~ T 
rather than k > T. □ 

Lemma 5.8. For any integers n and R satisfying < n < R the quantity 

2( Lfl/n+3j - rii/2n+3/2] +l)(i?/n+3- lR./2n+3/2'] )n ^ 
\R/2n+3/2']-l lB./n+3\ - {R/2n+3/2'] 

Y[ 2"" Yl 2-^*" (98) 

s=0 s=0 

is dominated by 2"(«/2«+3/2)' . 

Proof. For x E M, let x x (mod 1) such that x E [0, 1). It is easy to verify 
2\x'] - I2x\ ^ 2^3.5(04/2) + (5iEe[i/2,i) 




[R/n + 3J - \R/2n + 3/2] + 1 if R/2n + 3/2 E (0, 1/2) 
[R/n + 3\ - \R/2n + 3/2] if R/2n + 3/2 E [1/2,1) 

[i?/n + 3j - [i?/2n + 3/2] - 1 if i?/2n + 3/2 = 



Suppose R/2n + 3/2 E (0,1/2). Define A {R/2n + 3/2}. By hypothesis, 
A > 0; in particular, \R/2n + 3/2] = R/2n + 3/2 - A + 1. So, ^ is equal to 

2(lR/n+3\-{R/2n+3/2']+l){B^/n+3-{FI./2n+3/2'])n2(JR/2n+3/2']-l)n 
_ 2(-'<'/2"+3/2-A)(i?,/n+4-(i?y2n+3/2-A+l))n 
_ 2"(-R/2"+3/2)^-nA^ 



Now, suppose i?/2n + 3/2 E [1/2,1). Define A := 1 - {i?/2n + 3/2}. By 
hypothesis A > 0; so \R/2n + 3/2] = R/2n + 3/2 + A. Then, ^ is equal to 

2( L-R/n+3j - r-R/2n+3/2] +l)(i?/n+3- {R/2ji+3/2'] )n 
^ 2(-'*^/2"+3/2+A)(fl/n+3-(fl./2ri+3/2+A))ra 
_ 2"(-R/2"+3/2)^-nA^ 



Finally, suppose R/2n + 3/2 = 0. Then, R/n + 3 = and ((Ml) is equal to 

2ii(i?/2n+3/2)^ Q 

Lemma 5.9. There exist absolute constants K17 and nis such that for any 
e E (0, e~^), any integer j > 0, any integer Ki satisfying A(K,j,e) < Ki < K , 
any integer K2 satisfying Ki < K2 < K, any a E [0, 1], any ai E [0,4a], any 
real ai satisfying < |a2| < ct, any real ci satisfying \ciKi\ < 1/48, and any 
a, b satisfying 

(a + ax, b) E (0, 2) x [0, 1/4] for all x E [-1/4, 1/4] 

\a + axl > [a + ax + 2bK2\ for some x E [-1/4, 1/4], (99) 
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the sum 

— / / 2/^e2'^*'=i^'-2^"^e"2'^*"i^^2^'"^^'F(i^2;a + aa;,6)di/da; (100) 

K{ J-l/4 Jo 

can be computed with error bounded by 0{u{K,j,e)'^^''e) using 0{v{K,j,e)'^^^) 
arithmetic operations. The big-O constants are absolute and arithmetic is done 
using 0{y{K,j,eY) bits. 

Proof. Since a < 1, then ai < 4 and |a2| < 1. So, (|100p may be reduced via 
Taylor expansions to the form 

— / y^xPe^'^'^'y -^''''■'yF{K2;a + ax,b)dydx (101) 

K{ J-l/4 Jo 

where < p = 0{h'{K, j,e)) is an integer. The hypothesis ([M)) imphes that 
< a + ax + 2bK2 < 4 for aU x G [-1/4, 1/4], so 2bK2 < 4. Without loss of 
generality, we may assume that K2 is a multiple of 32. Subdivide F{K2', a+ax, h) 
into 32 consecutive subsums to reduce the problem to evaluating sums of the 
form 



1 



1/4 /-ifi 



where < m < 32 an integer, am := a + bmK2/16, Km '■= amK2/i2, and 
:= K2/i2 — 1. We can normalize a,„ so that am + aa; £ [—3/4, 3/4] for all 
a; G [—1/4, 1/4]. In particular, we can assume |am| + |aa;| + 2bK'i < 7/8 for all 
X G [—1/4,1/4]. Given this, we can apply Euler-Maclaurin summation to the 
sum F{Ks] ai+ax, b) in (|102p . The problem is essentially reduced to computing 
the integral 

1 /•1/4 j-Ki j.K3 _ 

/ / / yj ^p^2-Kiciy'-2vixy+2TTiKmX^2maxz+2TTiarnZ+2-!vibz fjly fjl^ 

K{ J- 1/4 Jo Jo 

Finally, Cauchy's Theorem as well as saddle-point techniques similar to the ones 
we carried out before allow to reduce the last integral to integrals of the type 
handled by Lemma 15.51 □ 

Lemma 5.10. There exist absolute constants K19 and K20 such that for any 
e G (0,e~^), any integer j > 0, any integer K > A{K, j,e), any integer K\ 
satisfying < Ki < K , any a,b G K/Z, any integer < I = 0{v{K,j, e)), any 
a G [—1, 1], and any /3 > 0, the sum 

I x^e^'''^^''F{K;a + ax + iXx,b)dx (103) 
Jo 

can be computed with error bounded by 0{v{K,j,e)"'^^e) using 0{i'{K, j, e)'^™) 
arithmetic operations. The big-O constants are absolute and arithmetic is done 
using 0{h'{K, e)^) bits. 
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Proof. Let us first assume that A = 0. We also assume that |q;| > 1/K, otherwise 
the sum reduces to the form F{K; a, b), which can be computed efficiently; see 
Section 1. 

Let M denote the set of integers {k S [0, K]\ \Ki + ak\ < 2{l + 1)}. If M is 
not empty, then it has the form [s, t], for some integer s, i e [0, K]. So, consider 
the sum 



Jo 



(104) 



Make the change of variable x ^ 2{l + l)x in (|104p . Then, subdivide the 
resulting integral into 2(^ + 1) consecutive subintegrals of the form 

27riafe+27rifcfc2 + 27Ti(Ki+ak)x/(2(l+l)) ^ Qgr-i 

^0 (2(^ + 1))' ^ ' 

where < n < 2(Z + 1) an integer. Next, apply Taylor expansions to the term 
^27,t{Ki+ak)x/ (2(1+1)) g^-^^ expand {x + nf to reduce pil5| to a sum of the form 

^, (^(/^ E("fc + + i^O'^e^"''^'"'"' dx (106) 

where Q < p — 0{v{K,j.e)) and i) < q — 0{h'{K, j, e)) are integers. Note by 
construction, \as + Ki\ < 2{l + 1) and \a{t — s)\ < 4(^ + 1). We may assume 
t — s > 0. Now, expand the term {ak + as + KiY to obtain sums of the form 

d) q\{2{l + \))i Jo ^ ^ 



where < d < g an integer. Integrate directly with respect to x in (I107p . 
then note that the constant infront of the sum ()107|) has magnitude less than 
5''|q!|'^/((7! {l+l)'^) < /{t—sY- So, the resulting quadratic sums can be handled 
by Theorem 12. 11 see Section 1. 

It remains to consider the (possible empty) intervals [0, s — 1] and + 1, _fC]. 
Over these intervals, directly integrate with respect to x in (|103p . We obtain 
sums of the form 

^1 g27riafc+27ri6fc^ 

(/ + 1-^;)! ^ {Ki+akY 

^ k—u ^ 

where 1 < w < Z + 1 is an integer and \Ki + afc| > 2(/ + 1) for all integers 
k G [u,r]. It is then not too hard to see that the sums (jlOSp can subdivided 
into O(logii') sums of the form 

7| N+A 2iTidk+2-!iibk^ 



{l + l-vy. {Ki+akY 



where N £ [u,r) an integer, A £ [0, iV] an integer, \Ki + aN\ > 2{l + 1), 
\A/{Ki + aN)\ < 1/2, plus a remainder sum of length 0{logK) that can be 
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computed directly. In particular, the sum (|109p can be reduced via Taylor 
expansions to sums of the type handled by Theorem 12. 11 see Section 1. 

Finally, the treatment of (|103p when A > is analogous. □ 
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